Readme File


Data replication files for Preuhs, Robert R. and Rodney E. Hero, "State Racial/Ethnic Context and Partisan-Ideological Sorting," SPPQ-2024-0012 (State Politics and Policy Quarterly).

The main analyses and supplemental materials can be replicated with the files included.


All analyses were conducted with Stata/MP 15.1.

Estimated runtime for all models reported in the paper and supplemental materials, excluding pre-processing and variable creation:  12 hours.


The final dataset combines individual-level data and state- and county-level data from the following sources:

Individual-level Data:  The Cumulative Cooperative Election Study 2006-2023.  Kuriwaki, Shiro, 2024, "Cumulative CES Common Content", https://doi.org/10.7910/DVN/II2DB6, Harvard Dataverse, V9.  Downloaded Feb. 24, 2025.

State- and County-level Data: American Community Survey (ACS) 5-Year Estimates for state- and county-level variables, downloaded from Social Explorer (https://www.socialexplorer.com/) May 13-25, 2024.


READ THE .do FILES BEFORE RUNNING THE ANALYSIS. Given the size of the dataset and the computational demands of the models, it is generally recommended to run sections of the .do files separately for the main analysis, so that a crash does not force a lengthy rerun. It is also recommended to use a log file or other means to capture the output. The .do code can be run in one batch if desired, but the runtime is about 12 hours at common processing speeds.
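The following is a minimal sketch of one way to capture output in a log file while running the replication code. The directory path and the log file name are placeholders and are not part of the replication files; the .do file name is the one listed under Files Included.

```stata
* Sketch only: the directory path and log file name are placeholders.
version 15.1
cd "path/to/replication/files"

* Record all output in a plain-text log, overwriting any earlier log of the same name.
log using "replication_log.txt", text replace

* Run the full replication code in one batch (about 12 hours).
* To run sections separately instead, open the .do file in the Do-file Editor
* and execute only the selected lines.
do "Context and Sorting Replication.do"

log close
```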


Files Included:



For replication of Tables and Figures in the main paper and online appendices only, use the following files:

"Context and Sorting Full Dataset.dta" is the Stata dataset.

"Context and Sorting Replication.do" is the Stata code for reproducing the analyses presented in the tables and figures from the main dataset, "Context and Sorting Full Dataset.dta".


"Context and Sorting Contrasts.dta" is a dataset of the contrasts created for Figures 2 and 5.  Code for contrast creation is found in "Context and Sorting Replication.do".

"Context and Sorting Contrasts.do" is the code to create Figures 2 and 5 from "Context and Sorting Contrasts.dta", which was created from the output of the "margins, contrasts" commands in two sections of the main analysis code.




For replication from the original raw datasets used to create the full merged dataset, contact the corresponding author.
The corresponding author can provide yearly county and state ACS data as well as the original CES downloaded dataset. Corresponding merge and variable creation .do files are also available.



Additional Notes:



Excel output files are generated with "outreg2" in a number of code lines associated with model estimation in the "Context and Sorting Replication.do" file.  These Excel files were used to create publication-quality tables.  The original Excel files are not included in the replication files, but they will be produced and saved in the default working directory when the replication code is run.
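Because outreg2 is a user-written command rather than part of official Stata, it may need to be installed before the replication code will run. A minimal sketch, assuming an internet connection; the directory path is a placeholder:

```stata
* Install outreg2 from the SSC archive only if it is not already installed.
capture which outreg2
if _rc != 0 {
    ssc install outreg2
}

* Sketch only: the path is a placeholder. The Excel output is written to the
* current working directory, so set it before running the replication code.
cd "path/to/replication/files"
pwd
```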

A separate dataset, "Context and Sorting Contrasts.dta", was created by copying the output of the "margins, contrasts" commands in Stata.  This dataset is included, and the code for creating the associated graphs is in "Context and Sorting Contrasts.do".

All graphs were initially created with Stata code but were also modified for presentation purposes in Stata's Graph Editor.
 


NOTE ON SOCIAL EXPLORER DATA DOWNLOAD:

Data acquired from Social Explorer (https://www.socialexplorer.com/) were downloaded through the following steps:

From Social Explorer's main page:
1. Tables
2. American Community Surveys (5-Year Estimates). Each year was downloaded separately, and the last year of each set of estimates was used as the year in the analysis, except that the 2009 estimates were substituted for 2008.
3. Begin Report
4. Choose State or County
5. Choose All States or All Counties
6. Variables were selected (see below for a list).
7. Data Download included "Output percents" for tab-delimited data, along with the Stata .dct and .do files.  The .dct and .do files provided by Social Explorer are NOT included in the replication files.
8. The .do files provided by Social Explorer were used, together with the accompanying .dct files, to create Stata files from the tab-delimited download files.

Contact the corresponding author for the code for cleaning and re-coding the data and merging with the CES.

Variables Selected for download in Social Explorer (also see the variable creation .do files for State or County-level ACS included in the replication files):

County-Level Variables in Social Explorer:


A00002_002

A04001_004

A04001_006

A04001_007

A04001_010

A12001_005

A14007_001

A17005_003

A14028_001

A06001_003

State-Level Variables in Social Explorer:

A04001_001

A00002_002

A04001_003

A04001_006
 
A04001_007

A04001_010

A12001_005

B12001_004

A14007_001

A06001_003

A17005_003

A14028_001





